Architecture-Aware Configuration and Scheduling of Matrix Multiplication on Asymmetric Multicore Processors

Author: Catalán Sandra
Igual Francisco D.
Mayo Rafael
Quintana-Ortí Enrique S.
Rodríguez-Sánchez Rafael
Publication date: 30/06/2015

Asymmetric multicore processors (AMPs) have recently emerged as an appealing technology for severely energy-constrained environments, especially in mobile appliances where heterogeneity in applications is mainstream. In addition, given the growing interest in low-power high-performance computing, this type of architecture is also being investigated as a means to improve the throughput-per-Watt of complex scientific applications. In this paper, we design and embed several architecture-aware optimizations into a multi-threaded general matrix multiplication (gemm), a key operation of the BLAS, in order to obtain a high-performance implementation for ARM big.LITTLE AMPs. Our solution is based on the reference implementation of gemm in the BLIS library, and integrates a cache-aware configuration as well as asymmetric static and dynamic scheduling strategies that carefully tune and distribute the operation's micro-kernels among the big and LITTLE cores of the target processor. The experimental results on a Samsung Exynos 5422, a system-on-chip with ARM Cortex-A15 and Cortex-A7 clusters that implements the big.LITTLE model, show that our cache-aware versions of gemm with asymmetric scheduling attain significant performance gains over their architecture-oblivious counterparts while exploiting all the resources of the AMP to deliver considerable energy efficiency.
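The difference between static and dynamic asymmetric scheduling can be illustrated with a small sketch. This is not the paper's implementation or the BLIS API: the function names, the 3:1 big-to-LITTLE speed ratio, and the per-chunk times are all hypothetical values chosen for illustration. The static variant fixes each cluster's share of the loop iterations up front, in proportion to an assumed throughput ratio; the dynamic variant lets each worker take the next chunk whenever it becomes idle.

```python
import heapq


def static_asymmetric_split(n_iters, big_cores, little_cores, ratio):
    """Statically split n_iters loop iterations between two clusters.

    ratio is the ASSUMED throughput of one big core relative to one
    LITTLE core (a hypothetical tuning parameter, not a measured value).
    Returns (iterations for the big cluster, iterations for the LITTLE cluster).
    """
    big_weight = big_cores * ratio
    little_weight = float(little_cores)
    big_share = round(n_iters * big_weight / (big_weight + little_weight))
    return big_share, n_iters - big_share


def dynamic_chunk_counts(n_chunks, chunk_times):
    """Simulate dynamic scheduling of n_chunks equally sized chunks.

    chunk_times[i] is the ASSUMED time worker i needs to process one chunk.
    Each worker takes the next chunk as soon as it is idle.
    Returns the number of chunks processed by each worker.
    """
    counts = [0] * len(chunk_times)
    # Min-heap of (time at which the worker becomes idle, worker index).
    idle_at = [(0.0, i) for i in range(len(chunk_times))]
    heapq.heapify(idle_at)
    for _ in range(n_chunks):
        t, i = heapq.heappop(idle_at)
        counts[i] += 1
        heapq.heappush(idle_at, (t + chunk_times[i], i))
    return counts


if __name__ == "__main__":
    # Four big and four LITTLE cores, assuming a big core is 3x faster.
    print(static_asymmetric_split(1000, 4, 4, 3.0))
    # One fast worker (1 time unit per chunk) and one slow worker (3 units).
    print(dynamic_chunk_counts(8, [1.0, 3.0]))
```

In this sketch the static split depends entirely on how well the assumed ratio matches the hardware, whereas the dynamic variant adapts to the actual per-chunk times at the cost of some coordination between workers.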

arXiv.org e-Print Archive

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositori Institucional de la Universitat Jaume I

The CHAIN-REDS Semantic Search Engine

Author: Barbera Roberto
Carrubba Carla
Inserra Giuseppina
Mayo-Garcia Rafael
Ricceri Rita
Publication date: 15/11/2013

e-Infrastructures, and in particular Data Repositories and Open Access Data Infrastructures, are essential platforms for e-Science and e-Research, and have been under construction for several years in Europe and the rest of the world to support diverse multi- and inter-disciplinary Virtual Research Communities. So far, however, it has been difficult for scientists to correlate papers with the datasets used to produce them and to discover data and documents easily. In this paper, the CHAIN-REDS project's Knowledge Base and its Semantic Search Engine are presented, which attempt to address those drawbacks and contribute to the reproducibility of science.

ZENODO

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

A Review of Lightweight Thread Approaches for High Performance Computing

Author: Balaji Pavan
Castelló Adrián
Mayo Rafael
Peña Antonio J.
Quintana-Ortí Enrique S.
Seo Sangmin
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 08/12/2016

High-level, directive-based solutions are becoming the programming models (PMs) of choice for multi- and many-core architectures. Several solutions relying on operating system (OS) threads work well with a moderate number of cores. However, exascale systems will spawn hundreds of thousands of threads in order to exploit their massively parallel architectures, and conventional OS threads are too heavy for that purpose. Several lightweight thread (LWT) libraries have recently appeared, offering lighter mechanisms to tackle massive concurrency. In order to examine the suitability of LWTs in high-level runtimes, we develop a set of microbenchmarks consisting of patterns commonly found in current parallel codes. Moreover, we study the semantics offered by some LWT libraries in order to expose the similarities between different LWT application programming interfaces. This study reveals that a reduced set of LWT functions can be sufficient to cover the common parallel code patterns, and that those LWT libraries perform better than OS thread-based solutions in cases where task and nested parallelism are becoming more popular with new architectures.

The researchers from the Universitat Jaume I de Castelló were supported by project TIN2014-53495-R of the MINECO, the Generalitat Valenciana fellowship programme Vali+d 2015, and FEDER. This work was partially supported by the U.S. Dept. of Energy, Office of Science, Office of Advanced Scientific Computing Research (SC-21), under contract DE-AC02-06CH11357. We gratefully acknowledge the computing resources provided and operated by the Joint Laboratory for System Evaluation (JLSE) at Argonne National Laboratory.

Peer reviewed. Postprint (author's final draft).

Crossref

UPCommons. Portal del coneixement obert de la UPC

Study of Zr, Cd and Ag ions by laser-induced breakdown spectrometry (Estudio de iones de Zr, Cd y Ag mediante espectrometría de ruptura inducida por láser)

Author: Mayo García Rafael
Publication venue: Universidad Complutense de Madrid, Servicio de Publicaciones
Publication date: 01/01/2004

In this work, the ZrIII, CdII and AgII ions were studied using laser-induced breakdown spectrometry (LIBS). To this end, a system for acquiring emission spectra of laser-produced plasmas was set up, and its spectral response was obtained over the range from 1900 to 7000 Å. A spectroscopic study of the different plasmas used was carried out, determining parameters such as their composition, temperature, electron density and self-absorption. From these parameters, it was also determined whether the plasmas were in Local Thermodynamic Equilibrium and whether they were optically thin. Transition probabilities were measured experimentally for the transitions arising from the 4d5d and 4d5p levels of ZrIII, from the 5p, 5d, 6s, 4d⁹5s², 6p, 4d⁹5s5p, 4f, 7p, 5f and 8p levels of CdII, and from the 5s² and 6s ³D₃ levels of AgII. These experiments were performed with zirconium oxide and a Zr-Cu alloy for ZrIII, with pure cadmium and a Cd-Zn alloy for CdII, and with pure silver for AgII. The transition probabilities and lifetimes of the above-mentioned ZrIII and CdII levels were also calculated theoretically using the relativistic Hartree-Fock method with configuration mixing.

Docta Complutense

Montera: A Framework for Efficient Execution of Monte Carlo Codes on Grid Infrastructures

Author: Llorente Ignacio M.
Mayo-García Rafael Ma
Rodriguez-Pascual Manuel
Publication venue: Institute of Informatics, Slovak Academy of Sciences
Publication date: 22/03/2013

The objective of this work is to improve the performance of Monte Carlo codes on Grid production infrastructures. To do so, the codes and the grid sites are characterized with simple parameters that model their behavior. Then, a new performance model for grid infrastructures is proposed, and an algorithm that employs this information is described. This algorithm dynamically calculates the number and size of tasks to execute on each site in order to maximize performance and reduce makespan. Finally, a newly developed framework called Montera is presented. Montera handles the execution of Monte Carlo codes in an unattended way, isolating the final user from the complexity of the problem. By employing two fusion Monte Carlo codes as example cases, along with the described characterizations and scheduling algorithm, a performance improvement of up to 650% over the current best results is obtained on a real production infrastructure, together with enhanced stability and robustness.

Computing and Informatics (E-Journal - Institute of Informatics, SAS, Bratislava)